Error code Error Description

W 213 Artificial or Unknown found in <213> in SEQ ID (3) 

W 213 Artificial or Unknown found in <213> in SEQ ID (4) 

W 402 Undefined organism found in <213> in SEQ ID (13) 

W 402 Undefined organism found in <213> in SEQ ID (14) 

W 402 Undefined organism found in <213> in SEQ ID (15) 

W 402 Undefined organism found in <213> in SEQ ID (16) 

W 213 Artificial or Unknown found in <213> in SEQ ID (49) 

W 213 Artificial or Unknown found in <213> in SEQ ID (50) 

W 213 Artificial or Unknown found in <213> in SEQ ID (51) 

W 213 Artificial or Unknown found in <213> in SEQ ID (52) 

W 213 Artificial or Unknown found in <213> in SEQ ID (53) 

W 213 Artificial or Unknown found in <213> in SEQ ID (54) 

W 213 Artificial or Unknown found in <213> in SEQ ID (55) 

W 213 Artificial or Unknown found in <213> in SEQ ID (56) 

W 213 Artificial or Unknown found in <213> in SEQ ID (57) 

W 213 Artificial or Unknown found in <213> in SEQ ID (58) 

W 213 Artificial or Unknown found in <213> in SEQ ID (69) 

W 213 Artificial or Unknown found in <213> in SEQ ID (70) 

W 213 Artificial or Unknown found in <213> in SEQ ID (71) 

W 213 Artificial or Unknown found in <213> in SEQ ID (72) 



Input Set: 



Output Set: 



Started: 2010-07-22 15:40:42.953 

Finished: 2010-07-22 15:40:50.330 

Elapsed: 0 hr(s) 0 min(s) 7 sec(s) 377 ms 

Total Warnings: 126

Total Errors: 0 

No. of SeqIDs Defined: 179
Actual SeqID Count: 179

Error code Error Description 

W 213 Artificial or Unknown found in <213> in SEQ ID (73) 

W 213 Artificial or Unknown found in <213> in SEQ ID (74) 

W 213 Artificial or Unknown found in <213> in SEQ ID (75) 

W 213 Artificial or Unknown found in <213> in SEQ ID (76) 

This error has occurred more than 20 times, will not be displayed



<210> 1 

<211> 1284 

<212> DNA 

<213> Escherichia coli 



<220> 

<221> CDS 

<222> (1)..(1281)

<223> coding for cytosine deaminase (codA) 



<400> 1 

gtg tcg aat aac gct tta caa aca att att aac gcc cgg tta cca ggc 48

Val Ser Asn Asn Ala Leu Gln Thr Ile Ile Asn Ala Arg Leu Pro Gly

1 5 10 15

gaa gag ggg ctg tgg cag att cat ctg cag gac gga aaa atc agc gcc 96

Glu Glu Gly Leu Trp Gln Ile His Leu Gln Asp Gly Lys Ile Ser Ala

20 25 30 

att gat gcg caa tcc ggc gtg atg ccc ata act gaa aac agc ctg gat 144

Ile Asp Ala Gln Ser Gly Val Met Pro Ile Thr Glu Asn Ser Leu Asp

35 40 45 

gcc gaa caa ggt tta gtt ata ccg ccg ttt gtg gag cca cat att cac 192

Ala Glu Gln Gly Leu Val Ile Pro Pro Phe Val Glu Pro His Ile His

50 55 60 

ctg gac acc acg caa acc gcc gga caa ccg aac tgg aat cag tcc ggc 240

Leu Asp Thr Thr Gln Thr Ala Gly Gln Pro Asn Trp Asn Gln Ser Gly

65 70 75 80 

acg ctg ttt gaa ggc att gaa cgc tgg gcc gag cgc aaa gcg tta tta 288

Thr Leu Phe Glu Gly Ile Glu Arg Trp Ala Glu Arg Lys Ala Leu Leu

85 90 95 

acc cat gac gat gtg aaa caa cgc gca tgg caa acg ctg aaa tgg cag 336 

Thr His Asp Asp Val Lys Gln Arg Ala Trp Gln Thr Leu Lys Trp Gln

100 105 110 

att gcc aac ggc att cag cat gtg cgt acc cat gtc gat gtt tcg gat 384

Ile Ala Asn Gly Ile Gln His Val Arg Thr His Val Asp Val Ser Asp

115 120 125 

gca acg cta act gcg ctg aaa gca atg ctg gaa gtg aag cag gaa gtc 432

Ala Thr Leu Thr Ala Leu Lys Ala Met Leu Glu Val Lys Gln Glu Val

130 135 140 

gcg ccg tgg att gat ctg caa atc gtc gcc ttc cct cag gaa ggg att 480

Ala Pro Trp Ile Asp Leu Gln Ile Val Ala Phe Pro Gln Glu Gly Ile

145 150 155 160 

ttg tcg tat ccc aac ggt gaa gcg ttg ctg gaa gag gcg tta cgc tta 528

Leu Ser Tyr Pro Asn Gly Glu Ala Leu Leu Glu Glu Ala Leu Arg Leu 

165 170 175 

ggg gca gat gta gtg ggg gcg att ccg cat ttt gaa ttt acc cgt gaa 576 

Gly Ala Asp Val Val Gly Ala Ile Pro His Phe Glu Phe Thr Arg Glu

180 185 190 

tac ggc gtg gag tcg ctg cat aaa acc ttc gcc ctg gcg caa aaa tac 624

Tyr Gly Val Glu Ser Leu His Lys Thr Phe Ala Leu Ala Gln Lys Tyr

195 200 205 

gac cgt ctc atc gac gtt cac tgt gat gag atc gat gac gag cag tcg 672

Asp Arg Leu Ile Asp Val His Cys Asp Glu Ile Asp Asp Glu Gln Ser

210 215 220 

cgc ttt gtc gaa acc gtt gct gcc ctg gcg cac cat gaa ggc atg ggc 720

Arg Phe Val Glu Thr Val Ala Ala Leu Ala His His Glu Gly Met Gly 

225 230 235 240 

gcg cga gtc acc gcc agc cac acc acg gca atg cac tcc tat aac ggg 768



Ala Arg Val Thr Ala Ser His Thr Thr Ala Met His Ser Tyr Asn Gly 

245 250 255 

gcg tat acc tca cgc ctg ttc cgc ttg ctg aaa atg tcc ggt att aac 816

Ala Tyr Thr Ser Arg Leu Phe Arg Leu Leu Lys Met Ser Gly Ile Asn

260 265 270 

ttt gtc gcc aac ccg ctg gtc aat att cat ctg caa gga cgt ttc gat 864

Phe Val Ala Asn Pro Leu Val Asn Ile His Leu Gln Gly Arg Phe Asp

275 280 285 

acg tat cca aaa cgt cgc ggc atc acg cgc gtt aaa gag atg ctg gag 912

Thr Tyr Pro Lys Arg Arg Gly Ile Thr Arg Val Lys Glu Met Leu Glu

290 295 300 

tcc ggc att aac gtc tgc ttt ggt cac gat gat gtc ttc gat ccg tgg 960

Ser Gly Ile Asn Val Cys Phe Gly His Asp Asp Val Phe Asp Pro Trp

305 310 315 320 

tat ccg ctg gga acg gcg aat atg ctg caa gtg ctg cat atg ggg ctg 1008 

Tyr Pro Leu Gly Thr Ala Asn Met Leu Gln Val Leu His Met Gly Leu

325 330 335 

cat gtt tgc cag ttg atg ggc tac ggg cag att aac gat ggc ctg aat 1056 

His Val Cys Gln Leu Met Gly Tyr Gly Gln Ile Asn Asp Gly Leu Asn

340 345 350 

tta atc acc cac cac agc gca agg acg ttg aat ttg cag gat tac ggc 1104

Leu Ile Thr His His Ser Ala Arg Thr Leu Asn Leu Gln Asp Tyr Gly

355 360 365 

att gcc gcc gga aac agc gcc aac ctg att atc ctg ccg gct gaa aat 1152

Ile Ala Ala Gly Asn Ser Ala Asn Leu Ile Ile Leu Pro Ala Glu Asn

370 375 380 

ggg ttt gat gcg ctg cgc cgt cag gtt ccg gta cgt tat tcg gta cgt 1200

Gly Phe Asp Ala Leu Arg Arg Gln Val Pro Val Arg Tyr Ser Val Arg

385 390 395 400 

ggc ggc aag gtg att gcc agc aca caa ccg gca caa acc acc gta tat 1248

Gly Gly Lys Val Ile Ala Ser Thr Gln Pro Ala Gln Thr Thr Val Tyr

405 410 415 

ctg gag cag cca gaa gcc atc gat tac aaa cgt tga 1284

Leu Glu Gln Pro Glu Ala Ile Asp Tyr Lys Arg

420 425 



<210> 2 
<211> 427 
<212> PRT 

<213> Escherichia coli 

<400> 2 

Val Ser Asn Asn Ala Leu Gln Thr Ile Ile Asn Ala Arg Leu Pro Gly

1 5 10 15

Glu Glu Gly Leu Trp Gln Ile His Leu Gln Asp Gly Lys Ile Ser Ala

20 25 30 

Ile Asp Ala Gln Ser Gly Val Met Pro Ile Thr Glu Asn Ser Leu Asp

35 40 45 

Ala Glu Gln Gly Leu Val Ile Pro Pro Phe Val Glu Pro His Ile His

50 55 60 

Leu Asp Thr Thr Gln Thr Ala Gly Gln Pro Asn Trp Asn Gln Ser Gly
65 70 75 80 

Thr Leu Phe Glu Gly Ile Glu Arg Trp Ala Glu Arg Lys Ala Leu Leu

85 90 95 

Thr His Asp Asp Val Lys Gln Arg Ala Trp Gln Thr Leu Lys Trp Gln
100 105 110 



Ile Ala Asn Gly Ile Gln His Val Arg Thr His Val Asp Val Ser Asp

115 120 125 

Ala Thr Leu Thr Ala Leu Lys Ala Met Leu Glu Val Lys Gln Glu Val

130 135 140 

Ala Pro Trp Ile Asp Leu Gln Ile Val Ala Phe Pro Gln Glu Gly Ile
145 150 155 160 

Leu Ser Tyr Pro Asn Gly Glu Ala Leu Leu Glu Glu Ala Leu Arg Leu 

165 170 175 

Gly Ala Asp Val Val Gly Ala Ile Pro His Phe Glu Phe Thr Arg Glu

180 185 190 

Tyr Gly Val Glu Ser Leu His Lys Thr Phe Ala Leu Ala Gln Lys Tyr

195 200 205 

Asp Arg Leu Ile Asp Val His Cys Asp Glu Ile Asp Asp Glu Gln Ser

210 215 220 

Arg Phe Val Glu Thr Val Ala Ala Leu Ala His His Glu Gly Met Gly 
225 230 235 240 

Ala Arg Val Thr Ala Ser His Thr Thr Ala Met His Ser Tyr Asn Gly 

245 250 255 

Ala Tyr Thr Ser Arg Leu Phe Arg Leu Leu Lys Met Ser Gly Ile Asn

260 265 270 

Phe Val Ala Asn Pro Leu Val Asn Ile His Leu Gln Gly Arg Phe Asp

275 280 285 

Thr Tyr Pro Lys Arg Arg Gly Ile Thr Arg Val Lys Glu Met Leu Glu

290 295 300 

Ser Gly Ile Asn Val Cys Phe Gly His Asp Asp Val Phe Asp Pro Trp
305 310 315 320 

Tyr Pro Leu Gly Thr Ala Asn Met Leu Gln Val Leu His Met Gly Leu

325 330 335 

His Val Cys Gln Leu Met Gly Tyr Gly Gln Ile Asn Asp Gly Leu Asn

340 345 350 

Leu Ile Thr His His Ser Ala Arg Thr Leu Asn Leu Gln Asp Tyr Gly

355 360 365 

Ile Ala Ala Gly Asn Ser Ala Asn Leu Ile Ile Leu Pro Ala Glu Asn

370 375 380 

Gly Phe Asp Ala Leu Arg Arg Gln Val Pro Val Arg Tyr Ser Val Arg
385 390 395 400 

Gly Gly Lys Val Ile Ala Ser Thr Gln Pro Ala Gln Thr Thr Val Tyr

405 410 415 

Leu Glu Gln Pro Glu Ala Ile Asp Tyr Lys Arg
420 425 



<210> 3 
<211> 1284 
<212> DNA 

<213> Artificial sequence 

<220> 

<223> Description of the artificial sequence: coding for 
cytosine deaminase (codA) 

<220> 

<221> misc_feature
<222> (1)..(3)

<223> mutation of GTG to ATG start codon for expression 
in eukaryotic hosts 



<220> 

<221> CDS 

<222> (1)..(1281)

<223> coding for cytosine deaminase (codA) 



<400> 3 

atg tcg aat aac gct tta caa aca att att aac gcc cgg tta cca ggc 48

Met Ser Asn Asn Ala Leu Gln Thr Ile Ile Asn Ala Arg Leu Pro Gly

1 5 10 15

gaa gag ggg ctg tgg cag att cat ctg cag gac gga aaa atc agc gcc 96

Glu Glu Gly Leu Trp Gln Ile His Leu Gln Asp Gly Lys Ile Ser Ala

20 25 30 

att gat gcg caa tcc ggc gtg atg ccc ata act gaa aac agc ctg gat 144

Ile Asp Ala Gln Ser Gly Val Met Pro Ile Thr Glu Asn Ser Leu Asp

35 40 45 

gcc gaa caa ggt tta gtt ata ccg ccg ttt gtg gag cca cat att cac 192

Ala Glu Gln Gly Leu Val Ile Pro Pro Phe Val Glu Pro His Ile His

50 55 60 

ctg gac acc acg caa acc gcc gga caa ccg aac tgg aat cag tcc ggc 240

Leu Asp Thr Thr Gln Thr Ala Gly Gln Pro Asn Trp Asn Gln Ser Gly

65 70 75 80 

acg ctg ttt gaa ggc att gaa cgc tgg gcc gag cgc aaa gcg tta tta 288

Thr Leu Phe Glu Gly Ile Glu Arg Trp Ala Glu Arg Lys Ala Leu Leu

85 90 95 

acc cat gac gat gtg aaa caa cgc gca tgg caa acg ctg aaa tgg cag 336 

Thr His Asp Asp Val Lys Gln Arg Ala Trp Gln Thr Leu Lys Trp Gln

100 105 110 

att gcc aac ggc att cag cat gtg cgt acc cat gtc gat gtt tcg gat 384

Ile Ala Asn Gly Ile Gln His Val Arg Thr His Val Asp Val Ser Asp

115 120 125 

gca acg cta act gcg ctg aaa gca atg ctg gaa gtg aag cag gaa gtc 432

Ala Thr Leu Thr Ala Leu Lys Ala Met Leu Glu Val Lys Gln Glu Val

130 135 140 

gcg ccg tgg att gat ctg caa atc gtc gcc ttc cct cag gaa ggg att 480

Ala Pro Trp Ile Asp Leu Gln Ile Val Ala Phe Pro Gln Glu Gly Ile

145 150 155 160 

ttg tcg tat ccc aac ggt gaa gcg ttg ctg gaa gag gcg tta cgc tta 528

Leu Ser Tyr Pro Asn Gly Glu Ala Leu Leu Glu Glu Ala Leu Arg Leu 

165 170 175 

ggg gca gat gta gtg ggg gcg att ccg cat ttt gaa ttt acc cgt gaa 576

Gly Ala Asp Val Val Gly Ala Ile Pro His Phe Glu Phe Thr Arg Glu

180 185 190 

tac ggc gtg gag tcg ctg cat aaa acc ttc gcc ctg gcg caa aaa tac 624

Tyr Gly Val Glu Ser Leu His Lys Thr Phe Ala Leu Ala Gln Lys Tyr

195 200 205 

gac cgt ctc atc gac gtt cac tgt gat gag atc gat gac gag cag tcg 672

Asp Arg Leu Ile Asp Val His Cys Asp Glu Ile Asp Asp Glu Gln Ser

210 215 220 

cgc ttt gtc gaa acc gtt gct gcc ctg gcg cac cat gaa ggc atg ggc 720

Arg Phe Val Glu Thr Val Ala Ala Leu Ala His His Glu Gly Met Gly 

225 230 235 240 

gcg cga gtc acc gcc agc cac acc acg gca atg cac tcc tat aac ggg 768

Ala Arg Val Thr Ala Ser His Thr Thr Ala Met His Ser Tyr Asn Gly 

245 250 255 

gcg tat acc tca cgc ctg ttc cgc ttg ctg aaa atg tcc ggt att aac 816

Ala Tyr Thr Ser Arg Leu Phe Arg Leu Leu Lys Met Ser Gly Ile Asn

260 265 270 



ttt gtc gcc aac ccg ctg gtc aat att cat ctg caa gga cgt ttc gat 864

Phe Val Ala Asn Pro Leu Val Asn Ile His Leu Gln Gly Arg Phe Asp

275 280 285 

acg tat cca aaa cgt cgc ggc atc acg cgc gtt aaa gag atg ctg gag 912

Thr Tyr Pro Lys Arg Arg Gly Ile Thr Arg Val Lys Glu Met Leu Glu

290 295 300 

tcc ggc att aac gtc tgc ttt ggt cac gat gat gtc ttc gat ccg tgg 960

Ser Gly Ile Asn Val Cys Phe Gly His Asp Asp Val Phe Asp Pro Trp

305 310 315 320 

tat ccg ctg gga acg gcg aat atg ctg caa gtg ctg cat atg ggg ctg 1008 

Tyr Pro Leu Gly Thr Ala Asn Met Leu Gln Val Leu His Met Gly Leu

325 330 335 

cat gtt tgc cag ttg atg ggc tac ggg cag att aac gat ggc ctg aat 1056 

His Val Cys Gln Leu Met Gly Tyr Gly Gln Ile Asn Asp Gly Leu Asn

340 345 350 

tta atc acc cac cac agc gca agg acg ttg aat ttg cag gat tac ggc 1104

Leu Ile Thr His His Ser Ala Arg Thr Leu Asn Leu Gln Asp Tyr Gly

355 360 365 

att gcc gcc gga aac agc gcc aac ctg att atc ctg ccg gct gaa aat 1152

Ile Ala Ala Gly Asn Ser Ala Asn Leu Ile Ile Leu Pro Ala Glu Asn

370 375 380 

ggg ttt gat gcg ctg cgc cgt cag gtt ccg gta cgt tat tcg gta cgt 1200

Gly Phe Asp Ala Leu Arg Arg Gln Val Pro Val Arg Tyr Ser Val Arg

385 390 395 400 

ggc ggc aag gtg att gcc agc aca caa ccg gca caa acc acc gta tat 1248

Gly Gly Lys Val Ile Ala Ser Thr Gln Pro Ala Gln Thr Thr Val Tyr

405 410 415 

ctg gag cag cca gaa gcc atc gat tac aaa cgt tga 1284

Leu Glu Gln Pro Glu Ala Ile Asp Tyr Lys Arg

420 425 



<210> 4 

<211> 427 

<212> PRT 

<213> Artificial sequence 

<220> 

<223> Description of the artificial sequence: coding for 
cytosine deaminase (codA) 

<400> 4 

Met Ser Asn Asn Ala Leu Gln Thr Ile Ile Asn Ala Arg Leu Pro Gly

1 5 10 15

Glu Glu Gly Leu Trp Gln Ile His Leu Gln Asp Gly Lys Ile Ser Ala

20 25 30 

Ile Asp Ala Gln Ser Gly Val Met Pro Ile Thr Glu Asn Ser Leu Asp

35 40 45 

Ala Glu Gln Gly Leu Val Ile Pro Pro Phe Val Glu Pro His Ile His

50 55 60 

Leu Asp Thr Thr Gln Thr Ala Gly Gln Pro Asn Trp Asn Gln Ser Gly
65 70 75 80 

Thr Leu Phe Glu Gly Ile Glu Arg Trp Ala Glu Arg Lys Ala Leu Leu

85 90 95 

Thr His Asp Asp Val Lys Gln Arg Ala Trp Gln Thr Leu Lys Trp Gln

100 105 110 

Ile Ala Asn Gly Ile Gln His Val Arg Thr His Val Asp Val Ser Asp



115 120 125 

Ala Thr Leu Thr Ala Leu Lys Ala Met Leu Glu Val Lys Gln Glu Val

130 135 140 

Ala Pro Trp Ile Asp Leu Gln Ile Val Ala Phe Pro Gln Glu Gly Ile
145 150 155 160 

Leu Ser Tyr Pro Asn Gly Glu Ala Leu Leu Glu Glu Ala Leu Arg Leu 

165 170 175 

Gly Ala Asp Val Val Gly Ala Ile Pro His Phe Glu Phe Thr Arg Glu

180 185 190 

Tyr Gly Val Glu Ser Leu His Lys Thr Phe Ala Leu Ala Gln Lys Tyr

195 200 205 

Asp Arg Leu Ile Asp Val His Cys Asp Glu Ile Asp Asp Glu Gln Ser

210 215 220 

Arg Phe Val Glu Thr Val Ala Ala Leu Ala His His Glu Gly Met Gly 
225 230 235 240 

Ala Arg Val Thr Ala Ser His Thr Thr Ala Met His Ser Tyr Asn Gly 
2 



